The design and implementation of the parallel out-of-core ScaLAPACK LU, QR, and Cholesky factorization routines
Abstract
This paper describes the design and implementation of three core factorization routines (LU, QR, and Cholesky) included in the out-of-core extension of ScaLAPACK. These routines allow the factorization and solution of a dense system that is too large to fit entirely in physical memory. The full matrix is stored on disk, and the factorization routines transfer sub-matrix panels into memory. The 'left-looking' column-oriented variant of the factorization algorithm is implemented to reduce the disk I/O traffic. The routines are implemented using a portable I/O interface and use high-performance ScaLAPACK factorization routines as in-core computational kernels. We present the details of the implementation for the out-of-core ScaLAPACK factorization routines, as well as performance and scalability results on a Beowulf Linux cluster. Copyright 2000 John Wiley & Sons, Ltd.
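To make the left-looking, panel-at-a-time idea concrete, here is a minimal serial sketch of an out-of-core Cholesky factorization in plain Python. It is an illustration under stated assumptions, not the ScaLAPACK out-of-core routines: the file layout (one raw file of doubles per column panel) and all names (`write_panel`, `read_panel`, `ooc_cholesky`, `store_matrix`, `load_matrix`) are hypothetical, there is no parallelism, and the in-core kernel is a simple unblocked loop rather than a ScaLAPACK call.

```python
import math
import os
import tempfile
from array import array


def panel_path(workdir, k):
    return os.path.join(workdir, f"panel_{k}.bin")


def write_panel(workdir, k, cols):
    """Store one column panel (a list of columns) as raw doubles."""
    with open(panel_path(workdir, k), "wb") as f:
        for col in cols:
            array("d", col).tofile(f)


def read_panel(workdir, k, n, width):
    """Load one column panel from disk as a list of columns."""
    cols = []
    with open(panel_path(workdir, k), "rb") as f:
        for _ in range(width):
            col = array("d")
            col.fromfile(f, n)
            cols.append(list(col))
    return cols


def panel_bounds(n, width):
    """Return (panel index, first column, panel width) for every panel."""
    return [(k, j0, min(width, n - j0))
            for k, j0 in enumerate(range(0, n, width))]


def store_matrix(workdir, a, width):
    """Write a dense n x n matrix (list of rows) to disk panel by panel."""
    n = len(a)
    for k, j0, w in panel_bounds(n, width):
        write_panel(workdir, k,
                    [[a[i][j0 + c] for i in range(n)] for c in range(w)])


def load_matrix(workdir, n, width):
    """Read all panels back into a dense list-of-rows matrix."""
    m = [[0.0] * n for _ in range(n)]
    for k, j0, w in panel_bounds(n, width):
        cols = read_panel(workdir, k, n, w)
        for c in range(w):
            for i in range(n):
                m[i][j0 + c] = cols[c][i]
    return m


def ooc_cholesky(workdir, n, width):
    """Left-looking Cholesky (A = L L^T) with one active panel in memory."""
    for k, j0, w in panel_bounds(n, width):
        panel = read_panel(workdir, k, n, w)
        # Apply updates from every already-factored panel to the left.
        # Those panels are only read, never rewritten.
        for p, _, pw in panel_bounds(n, width)[:k]:
            left = read_panel(workdir, p, n, pw)
            for c in range(w):
                j = j0 + c
                for lcol in left:
                    ljt = lcol[j]
                    for i in range(j, n):
                        panel[c][i] -= lcol[i] * ljt
        # Factor the updated panel in core, column by column.
        for c in range(w):
            j = j0 + c
            col = panel[c]
            for t in range(c):  # updates from earlier columns of this panel
                ljt = panel[t][j]
                for i in range(j, n):
                    col[i] -= panel[t][i] * ljt
            d = math.sqrt(col[j])
            for i in range(j, n):
                col[i] /= d
            for i in range(j):
                col[i] = 0.0  # strictly upper part of L is zero
        write_panel(workdir, k, panel)  # each panel is written exactly once


if __name__ == "__main__":
    # Small symmetric positive definite example, factored in a temp directory.
    a = [[4.0, 2.0, 2.0], [2.0, 5.0, 3.0], [2.0, 3.0, 6.0]]
    with tempfile.TemporaryDirectory() as workdir:
        store_matrix(workdir, a, width=2)
        ooc_cholesky(workdir, n=3, width=2)
        for row in load_matrix(workdir, 3, 2):
            print(row)
```

In this sketch each panel is written back to disk exactly once, after all updates from the panels to its left have been applied; earlier panels are only re-read. That write-once pattern is the property that makes the left-looking variant attractive when disk traffic dominates.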
Similar resources
Design and Implementation of the ScaLAPACK LU, QR, and Cholesky Factorization Routines
This paper discusses the core factorization routines included in the ScaLAPACK library. These routines allow the factorization and solution of a dense system of linear equations via LU, QR, and Cholesky. They are implemented using a block cyclic data distribution, and are built using de facto standard kernels for matrix and vector operations (BLAS and its parallel counterpart PBLAS) and message...
The Design and Implementation of the Parallel Out-of-Core ScaLAPACK
This paper describes the design and implementation of three core factorization routines (LU, QR and Cholesky) included in the out-of-core extension of ScaLAPACK. These routines allow the factorization and solution of a dense system that is too large to fit entirely in physical memory. An image of the full matrix is maintained on disk and the factorization routines transfer sub-matrices into mem...
Key Concepts for Parallel Out-of-Core LU Factorization
This paper considers key ideas in the design of out-of-core dense LU factorization routines. A left-looking variant of the LU factorization algorithm is shown to require less I/O to disk than the right-looking variant, and is used to develop a parallel, out-of-core implementation. This implementation makes use of a small library of parallel I/O routines, together with ScaLAPACK and PBLAS routine...
Scalability Issues Affecting the Design of a Dense Linear Algebra Library
This paper discusses the scalability of Cholesky, LU, and QR factorization routines on MIMD distributed memory concurrent computers. These routines form part of the ScaLAPACK mathematical software library that extends the widely-used LAPACK library to run efficiently on scalable concurrent computers. To ensure good scalability and performance, the ScaLAPACK routines are based on block-partitioned...
PoLAPACK: parallel factorization routines with algorithmic blocking
LU, QR, and Cholesky factorizations are the most widely used methods for solving dense linear systems of equations, and have been extensively studied and implemented on vector and parallel computers. Most of these factorization routines are implemented with block-partitioned algorithms in order to perform matrix-matrix operations, that is, to obtain the highest performance by maximizing reuse of...
Journal title:
- Concurrency: Practice and Experience
Volume 12
Publication date: 2000